Comparing Boolean and Probabilistic Information Retrieval Systems across Queries and Disciplines
Abstract
Whether Boolean queries or ranking documents by document and term weights yields better retrieval performance has been the subject of considerable discussion among users and researchers of document retrieval systems. We suggest a method that allows one to compare the two approaches analytically and examine their relative merits. The performance of information retrieval systems may be determined either through experimental simulation or through analytic techniques that directly estimate retrieval performance from values for query and database characteristics. Using these performance-predicting techniques, sample performance figures are provided for queries using the Boolean AND and OR operators, as well as for probabilistic systems assuming either statistical term independence or term dependence. The variation of performance across sublanguages (used in different academic disciplines) and across queries is examined, as is the performance of models that fail to meet statistical and other assumptions.
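To make the two retrieval styles contrasted in the abstract concrete, here is a minimal sketch. It is not the paper's analytic method: the toy collection, the function names, and the probability values are all hypothetical, and the ranking uses the standard log-odds term weight of a binary independence model as one common way to realize "probabilistic retrieval assuming term independence".

```python
import math

# Hypothetical toy collection: each document is a set of index terms.
DOCS = {
    "d1": {"boolean", "retrieval"},
    "d2": {"probabilistic", "retrieval", "ranking"},
    "d3": {"boolean", "ranking"},
    "d4": {"retrieval"},
}


def boolean_and(docs, terms):
    """Return ids of documents that contain every query term."""
    return sorted(d for d, ts in docs.items() if all(t in ts for t in terms))


def boolean_or(docs, terms):
    """Return ids of documents that contain at least one query term."""
    return sorted(d for d, ts in docs.items() if any(t in ts for t in terms))


def independence_weight(p, q):
    """Log-odds weight for one term under the term-independence assumption.

    p = P(term present | relevant), q = P(term present | non-relevant).
    Both values are assumed inputs here, not estimated from data.
    """
    return math.log((p * (1 - q)) / (q * (1 - p)))


def rank(docs, term_probs):
    """Rank documents by the sum of the weights of the query terms they contain."""
    weights = {t: independence_weight(p, q) for t, (p, q) in term_probs.items()}
    scores = {
        d: sum(w for t, w in weights.items() if t in ts)
        for d, ts in docs.items()
    }
    # Highest score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    query = ["boolean", "retrieval"]
    print(boolean_and(DOCS, query))
    print(boolean_or(DOCS, query))
    # Assumed (p, q) pairs for each query term.
    print(rank(DOCS, {"boolean": (0.8, 0.2), "retrieval": (0.6, 0.4)}))
```

The Boolean functions return unordered sets of matches (sorted here only for display), whereas the ranked version orders every document, which is the difference the abstract's performance comparison has to account for.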
Similar resources
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Public Transport Ontology for Passenger Information Retrieval
Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...
Using Structured Queries for Disambiguation in Cross-Language Information Retrieval
Bilingual transfer dictionaries are an important resource for query translation in cross-language text retrieval. However, term translation is not an isomorphic process, so dictionary-based systems must address the problem of ambiguity in language translation. In this paper, we claim that Boolean conjunction (the AND operator) provides simple and automatic disambiguation in the target languag...
Effective Information Retrieval Method Based on Matching Adaptive Genetic Algorithm
An Information Retrieval (IR) system is complex in nature because of the interactions between documents and queries, which means that matching document representations to query representations is not straightforward. The Genetic Algorithm (GA) is widely used in IR systems to improve the effectiveness of such systems. This study uses the Vector Space Model (VSM) and the Extended Boole...
Journal title:
- JASIS
Volume 48, Issue
Pages -
Publication date: 1997